- Path: anvil.ugrad.cs.ubc.ca!not-for-mail
- From: c2a192@ugrad.cs.ubc.ca (Kazimir Kylheku)
- Newsgroups: comp.lang.c
- Subject: Re: [Perf:] mem*() procs vs. array looping
- Date: 28 Feb 1996 12:24:29 -0800
- Organization: Computer Science, University of B.C., Vancouver, B.C., Canada
- Message-ID: <4h2dltINNlag@anvil.ugrad.cs.ubc.ca>
- References: <4glkq1$gu7@gazette.tandem.com> <4h1n14$3b3@news.interpath.net>
- NNTP-Posting-Host: anvil.ugrad.cs.ubc.ca
-
- In article <4h1n14$3b3@news.interpath.net>,
- Scott McMahan - Softbase Systems <softbase@mercury.interpath.net> wrote:
- >Francis E. Chang (francis@patch.tandem.com) wrote:
- >
- >: Are mem*() procedures performance boosters?
- >
- >I have had some experience with this exact question, and want to
- >share what I found out.
- >
- >1. The only real answer is, "it depends". It depends on who wrote the
- >stdlib routines, how conscientious they were, how much they knew about
- >the hardware they were writing on, etc. From one library to the
- >next, the answer could change.
-
- What makes you think that memcpy() and friends are necessarily function calls
- to library routines? Since they are functions defined by the standard, the
- compiler is licensed to generate inline code for a reference to them, provided
- that you are using a hosted C environment and that they have not been
- redefined with internal linkage (that is, as static functions) in the
- translation unit, in the scope of the reference. You can redefine a standard
- function such as memcpy() in this way if you don't #include the header that
- declares it with external linkage, and if you define it as static. In that
- case, the compiler will generate code that calls _your_ memcpy() rather than
- inline code that implements the standard memcpy().
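-
- A minimal sketch of a reference to the standard memcpy() that a compiler is
- free to expand inline. The helper name load_ulong is hypothetical. Because the
- size argument is a compile-time constant, an optimizing compiler may replace
- the call with a plain load; whether yours does is an implementation matter,
- so check the generated assembly rather than taking it on faith.
-
```c
#include <string.h>

/* Hypothetical helper: read an unsigned long from a possibly unaligned
   buffer. The size passed to memcpy() is a constant, which gives the
   compiler its best chance to expand the call inline instead of calling
   a library routine. */
unsigned long load_ulong(const unsigned char *p)
{
    unsigned long v;

    memcpy(&v, p, sizeof v);
    return v;
}
```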
-
- >2. The only way to tell is on a per-stdlib basis! Measuring execution
- >of the program in a profiler for every different memxxx in every
- >different library. You *have* to profile the program and see what
- >is going on. Does calling memxxx even matter in the overall scheme?
-
- You don't have to profile. This is far too simple to require profiling.
- Timing a large number of memcpy() operations should be quite telling.
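-
- A rough sketch of such a timing run, using only clock() from <time.h>. The
- names time_copies and loop_copy are hypothetical, and the buffer size and
- repetition count are left to the caller. Note that an optimizer may recognize
- the loop in loop_copy() and turn it into a memcpy() call itself, so compare
- at the optimization level you actually ship with.
-
```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical byte-at-a-time copy to compare against memcpy(). */
static void loop_copy(unsigned char *dst, const unsigned char *src, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Returns the CPU seconds used by `reps` copies of `size` bytes, or -1.0
   on failure. A nonzero use_memcpy selects memcpy(); zero selects
   loop_copy(). */
double time_copies(size_t size, long reps, int use_memcpy)
{
    unsigned char *src, *dst;
    volatile unsigned char sink = 0;  /* keeps the copies from being dropped */
    clock_t start, stop;
    long i;

    if (size == 0)
        return -1.0;
    src = malloc(size);
    dst = malloc(size);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }
    memset(src, 0x5a, size);

    start = clock();
    for (i = 0; i < reps; i++) {
        src[0] = (unsigned char)i;    /* vary the input on each pass */
        if (use_memcpy)
            memcpy(dst, src, size);
        else
            loop_copy(dst, src, size);
        sink ^= dst[size - 1];        /* consume the result */
    }
    stop = clock();

    free(src);
    free(dst);
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```
-
- Calling time_copies(n, reps, 1) and time_copies(n, reps, 0) with the same
- arguments gives two figures to compare. Pick reps large enough that the
- elapsed time is well above the resolution of clock().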
-
- >3. Most if not all commercial compilers will write things like memcpy
- >in assembly language, and some processors have native instructions for
- >doing memory copies and stuff that make them much faster than any C
- >code you could write, because they don't have to continually load
- >addresses like you do in a loop.
-
- Not only that, but some machines have special co-processors for doing these
- kinds of blits (e.g. Amiga, some Suns).
-
- >4. I had a program where memset was THE bottleneck, taking up more time
- >than I/O. I re-wrote a memory zeroing function using the most unrolled,
- >efficient loop I could write, and it was still orders of magnitude
- >slower than the system memset. No way I could make it faster in C.
-
- The compiler may have generated inline code using the best idiom for the
- architecture. That wins hands down, unless you give up the standard
- conformance of your code by writing explicit inline assembly language that is
- even better. That is generally not worth it.
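-
- For comparison, here is a sketch of the kind of hand-unrolled zeroing loop
- being discussed, next to the one-line standard call. The names zero_loop and
- zero_std are hypothetical, and this is not the code from the quoted article.
- Both produce the same result; only the memset() version leaves the choice of
- idiom to the implementation.
-
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical hand-written zeroing routine, unrolled four bytes at a time. */
void zero_loop(void *buf, size_t n)
{
    unsigned char *p = buf;

    while (n >= 4) {
        p[0] = 0;
        p[1] = 0;
        p[2] = 0;
        p[3] = 0;
        p += 4;
        n -= 4;
    }
    while (n > 0) {       /* leftover bytes when n is not a multiple of 4 */
        *p++ = 0;
        n--;
    }
}

/* The standard call: the compiler or library picks the implementation. */
void zero_std(void *buf, size_t n)
{
    memset(buf, 0, n);
}
```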
-
- >5. WRT #4, I removed the memset. It was initializing a buffer, and I
- >said "forget it" and used the uninitialized buffer and took my chances.
- >Changing the design was more efficient than coding hokey
- >optimizations! After removing memset, I/O functions accounted for
- >90% or more of the time spent in the program, which I could live with.
-
- Did you assume anything about the uninitialized contents? That is never a
- good idea for automatic or dynamically allocated storage, whose initial
- contents are indeterminate. Though if you need the speed, you can always try
- sacrificing portability.
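-
- Skipping the memset() is safe as long as nothing reads a byte before it has
- been written. A minimal sketch of that discipline, with the hypothetical
- names sum_record and REC_MAX: the buffer is never cleared, but only the
- bytes that were actually stored are ever examined.
-
```c
#include <stddef.h>
#include <string.h>

#define REC_MAX 256  /* hypothetical record size limit */

/* Copies at most REC_MAX bytes into an uninitialized automatic buffer and
   sums only the bytes that were written. Nothing past `len` is read, so
   the indeterminate remainder of the buffer never matters. */
unsigned long sum_record(const char *data, size_t len)
{
    unsigned char buf[REC_MAX];   /* deliberately not cleared */
    unsigned long sum = 0;
    size_t i;

    if (len > REC_MAX)
        len = REC_MAX;
    memcpy(buf, data, len);
    for (i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}
```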
-
- >Scott
- >